Counting monotonic subsequences in a data stream
Abstract
We prove an Ω(n) lower bound for any deterministic streaming algorithm that exactly computes the number of inversions (2-dec-count) in a data stream of t elements, where each element comes from [n] and t = Ω(n). The proof is a reduction from set disjointness and relies on the known communication lower bounds for that problem. Our second result is an Ω(n) lower bound for any algorithm that correctly computes k-dec-count in a data stream. Here again the proof is a reduction: we show that any algorithm that computes k-dec-count can be used to count inversions.
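To make the two quantities concrete, here is a minimal Python sketch. It assumes that k-dec-count means the number of strictly decreasing subsequences of length k, so that 2-dec-count is the number of pairs i < j with a_i > a_j; the abstract does not spell this definition out. The function names are hypothetical and the code is not taken from the paper. The one-pass counter keeps a Fenwick tree of n counters, which is in line with the Ω(n) bound but is only an illustration, not the paper's construction.

```python
from itertools import combinations


def count_inversions_stream(stream, n):
    """Count pairs i < j with stream[i] > stream[j] in a single pass.

    Assumes every element is an integer in 1..n. The state kept between
    elements is a Fenwick (binary indexed) tree of n counters plus two integers.
    """
    tree = [0] * (n + 1)  # tree[1..n] stores partial counts of values seen so far
    seen = 0              # number of elements read so far
    inversions = 0
    for x in stream:
        # Prefix query: how many earlier elements are <= x.
        i, at_most_x = x, 0
        while i > 0:
            at_most_x += tree[i]
            i -= i & -i
        # Every earlier element strictly greater than x forms an inversion with x.
        inversions += seen - at_most_x
        # Point update: record x in the tree.
        i = x
        while i <= n:
            tree[i] += 1
            i += i & -i
        seen += 1
    return inversions


def k_dec_count_bruteforce(seq, k):
    """Count strictly decreasing subsequences of length k by enumeration.

    Exponential in k; intended only as a reference definition for small inputs.
    """
    return sum(
        1
        for idx in combinations(range(len(seq)), k)
        if all(seq[idx[j]] > seq[idx[j + 1]] for j in range(k - 1))
    )
```

For example, `count_inversions_stream([3, 1, 2, 3, 1], 3)` and `k_dec_count_bruteforce([3, 1, 2, 3, 1], 2)` should agree, since under the assumed definition the second is just the inversion count computed by enumeration.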
Similar resources
Efficient Identification of Common Subsequences from Big Data Streams Using Sliding Window Technique
We propose an efficient Frequent Sequence Stream algorithm for identifying the top k most frequent subsequences over big data streams. The Sequence Stream algorithm is efficient because it runs in linear time and uses very little space. Given a pre-specified subsequence window size S and the value k, with very high probability the Sequence Stream algorithm retrieves the top...
Monotonic subsequences in dimensions higher than one
The 1935 result of Erdős and Szekeres that any sequence of ≥ n² + 1 real numbers contains a monotonic subsequence of ≥ n + 1 terms has stimulated extensive further research, including a paper of J. B. Kruskal that defined an extension of monotonicity for higher dimensions. This paper provides a proof of a weakened form of Kruskal’s conjecture for 2-dimensional Euclidean space by showing that there...
Incremental Algorithm for Discovering Frequent Subsequences in Multiple Data Streams
In recent years, new applications emerged that produce data streams, such as stock data and sensor networks. Therefore, finding frequent subsequences, or clusters of subsequences, in data streams is an essential task in data mining. Data streams are continuous in nature, unbounded in size and have a high arrival rate. Due to these characteristics, traditional clustering algorithms fail to effec...
Similarity Search for Multidimensional Data Sequences
Time-series data, which are a series of one-dimensional real numbers, have been studied in various database applications. In this paper, we extend the traditional similarity search methods on time-series data to support a multidimensional data sequence, such as a video stream. We investigate the problem of retrieving similar multidimensional data sequences from a large database. To prune irrele...
Fast Algorithm for the Analysis of the Presence of Short Oligonucleotide Subsequences in Genomic Sequences
Statistical analysis of the appearance of short subsequences in different DNA sequences, from individual genes to full genomes, is important for various reasons. Applications include PCR primers and microarray probes design. Moreover, the distribution of short subsequences (n-mers) in a genome can be used to distinguish between species with relatively short genome sizes (e.g., viruses and micro...
Publication date: 2017